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Time-varying excitatory and inhibitory synaptic inputs govern activity of neurons and 
process information in the brain. The importance of trial-to-trial fluctuations of synaptic 
inputs has recently been investigated in neuroscience. Such fluctuations are ignored in 
the most conventional techniques because they are removed when trials are averaged 
during linear regression techniques. Here, we propose a novel recursive algorithm based 
on Gaussian mixture Kalman filtering (GMKF) for estimating time-varying excitatory and 
inhibitory synaptic inputs from single trials of noisy membrane potential in current clamp 
recordings. The KF is followed by an expectation maximization (EM) algorithm to infer 
the statistical parameters (time-varying mean and variance) of the synaptic inputs in a 
non-parametric manner. As our proposed algorithm is repeated recursively, the inferred 
parameters of the mixtures are used to initiate the next iteration. Unlike other recent 
algorithms, our algorithm does not assume an a priori distribution from which the synaptic 
inputs are generated. Instead, the algorithm recursively estimates such a distribution by 
fitting a Gaussian mixture model (GMM). The performance of the proposed algorithms is 
compared to a previously proposed PF-based algorithm (Paninski et al., 2012) with several 
illustrative examples, assuming that the distribution of synaptic input is unknown. If noise 
is small, the performance of our algorithms is similar to that of the previous one. However, 
if noise is large, they can significantly outperform the previous proposal. These promising 
results suggest that our algorithm is a robust and efficient technique for estimating time 
varying excitatory and inhibitory synaptic conductances from single trials of membrane 
potential recordings. 
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INTRODUCTION 

Interaction of the excitatory and inhibitory synaptic inputs con- 
structs the shape of the receptive fields and can elucidate the 
synaptic mechanism underlying the functional activities of neu- 
rons. Therefore, inferring synaptic inputs from neuronal record- 
ings is an important topic of interest in neuroscience (Shu et al., 
2003; Wehr and Zador, 2003; Priebe and Ferster, 2005; Murphy 
and Rieke, 2006; Ozeki et al, 2009; Haider et al, 2013). In many 
cases, intercellular recordings of membrane potential (or cur- 
rent) under pharmacological blockade spiking activities are used 
to estimate synaptic inputs. Estimating synaptic inputs based on 
averaging over many trials and linear regression fitting, which 
is commonly used, is not always the best methodology because 
the trial-to-trial variations of synaptic inputs are ignored. The 
significance of such variations in understanding the neuronal 
mechanisms of the brain activity (especially spontaneous) and 
their key roles in information processing is well reviewed in 
(Destexhe and Contreras, 2006). 

The main scope of this paper is to develop an efficient recur- 
sive algorithm for estimating synaptic inputs from single trials 



of recorded membrane potential. We point out two recent stud- 
ies (Kobayashi et al., 2011; Paninski et al., 2012) that have used 
the well-known Bayesian approach to infer synaptic inputs in 
single trials. In both studies, promising results were reported 
under low observation noise. Kobayashi et al. (2011) considered 
the Ornstein-Uhlenbeck stochastic model with time-dependent 
mean and variance as the neuronal model. Kalman filtering 
(KF) was then used to track these statistical quantities from 
recorded membrane potential. Paninski et al. (2012) used a 
compact neuronal model associated with two differential equa- 
tions representing the dynamics of the excitatory and inhibitory 
synaptic conductances. Then the sequential Monte-Carlo method 
[particle filtering (PF)] was derived for filtering/smoothing the 
dynamics of the model. Finally, an expectation maximization 
(EM) algorithm (in both parametric and non-parametric man- 
ner) was used to infer the time-varying mean of the synap- 
tic conductances. Since the above-mentioned studies used the 
Bayesian approach, the distributions of synaptic inputs need to 
be known as a priori knowledge. This is the major theoreti- 
cal drawback of these methods because synaptic distributions 
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are unknown in real neurons. Moreover, Kobayashi et al. (2011) 
assumed that all excitatory or inhibitory synaptic weights are 
identical to obtain an explicit relation between the excitatory and 
inhibitory synaptic inputs vs. the mean and variance of input 
current, and this assumption does not necessarily hold in real 
neurons. 

The difficulty in estimating the time course of both excita- 
tory and inhibitory synaptic inputs from only a single trial of 
the recorded data as compared with other conventional meth- 
ods (averaging and estimating the mean of synaptic inputs) is 
that the problem is underdetermined since two unknown vari- 
ables have to be estimated at each time instant. We propose a 
robust recursive algorithm, based on Gaussian mixture Kalman 
filtering (GMKF), for filtering/smoothing the dynamics of a com- 
pact neuronal model (including synaptic conductances) followed 
by an EM algorithm to infer the statistical parameters of such 
synaptic inputs. Our methodology provides more degrees of free- 
dom for these inputs by estimating their distributions with a 
Gaussian mixture model (GMM). Accordingly, as we are deal- 
ing with Gaussian distribution for each mixand, KF is considered 
as an optimal filtering, which is also faster and easier than the 
PF approach (Paninski et al., 2012). Once the neuron dynam- 
ics are estimated, we can simply (in a closed form) infer the 
time-varying mean and variance of the synaptic inputs using 
a non-parametric spline method. Our major contribution in 
this paper is the development of a general framework for esti- 
mating time-varying synaptic conductances when there is no 
pre-assumption about the synaptic conductances dynamics, e.g., 
small changes of amplitudes upon presynaptic spikes (Kobayashi 
et al, 2011), exponential distribution of the synaptic input, 
or exponential non-linearity to describe the presynaptic input 
(Paninski et al., 2012). Note that, in the special case of a single 
Gaussian distribution, our algorithm reduces to the standard KF 
used in a recursive framework. The simulation results demon- 
strate the accuracy and robustness of our both KF- and GMKF- 
based algorithms compared to the PF-based algorithm. While the 
proposed general GMKF-based algorithm exhibits accurate and 
robust performance over the entire range of parameters studied, 
our KF-based algorithm exhibits fast and simple estimation in 
many representative scenarios. Thus, our general GMKF-based 
algorithm is a promising tool in neuroscience for estimating exci- 
tatory and inhibitory synaptic conductances from single trials of 
recordings. 

The organization of the paper is as follows. In Section 
Materials and Methods, we introduce the problem of estimat- 
ing excitatory and inhibitory synaptic conductances (inputs). 
In addition, sufficient details of our proposed algorithms are 
explained in this section. In Section Results, we present our sim- 
ulation results and statistical analysis on the performance of the 
proposed and existing algorithms. Finally, in Section Discussion, 
some discussions and concluding remarks are given. 

MATERIALS AND METHODS 
PROBLEM FORMULATION 

A reasonable neuronal model similar to Paninski et al. (2012) 
represents the dynamics of a single-compartment neuron that 
receives synaptic inputs. The observed membrane potential shows 



the sub-threshold membrane voltage (active channels are phar- 
macologically blocked). This model can be expressed as follows. 

V(t + 1) = 7(t) + dt[g L (E L - 7(0) + g E (t){E E - 7(0) 

+ g I (t)(E I -V(t))] + W (t) 
g E (t + 1) = g E (t) - dt^& + N E (t) 1 ~ ] 

gi(t + I) = gi(t) - dt g -^ +N I (t) 

where V, g£, and gj are the dynamics of the neuron indicating 
the membrane potential and excitatory and inhibitory synaptic 
conductances, respectively, w(t) is white Gaussian noise of vari- 
ance a\,, NE(t) and N/(f) are the instantaneous excitatory and 
inhibitory synaptic inputs to the neuron at time step t (Koch, 
1998; Huys et al., 2006; Paninski et al., 2012), respectively, and 
dt is the time bin that may differ from the voltage recording 
sampling time (Paninski et al., 2012). Note that the time index 
t takes integer values between 0 and T, where T x dt is the entire 
(physical) time of recording. We assume that these time steps are 
equidistant and represent the actual physical sampling duration. 
Similar to Kobayashi et al. (2011) and Paninski et al. (2012), the 
reversal potentials El, Ee, and Ej, the leakage conductance gi, and 
the synaptic time constants xe and t; are known. Note that the 
capacitance of the membrane potential is set to 1 [iF and therefore 
removed from (1). 

Our objective in this paper is to assess the time trace of 
the excitatory and inhibitory synaptic conductances, gE and gi, 
as well as the corresponding synaptic inputs Ne and Nj from 
noisy membrane potential using the known Bayesian approach. 
To optimally reconstruct the time course of the excitatory and 
inhibitory synaptic conductances, we have to determine the prob- 
ability distributions of the corresponding synaptic inputs, as the 
a priori knowledge in the Bayesian approach. Most of previ- 
ous studies used Poisson distribution as the distribution of the 
synaptic inputs (Kobayashi et al., 2011) [see also Paninski et al. 
(2012) that derived PF for the exponential distribution]. Here we 
use a weaker assumption about the distributions of the synaptic 
inputs, namely, the probability distribution function (pdf) of the 
synaptic input can be estimated by a finite number of weighted 
Gaussians — GMM. Moreover, by identifying and tracking each 
Gaussian component with KF, we propose a general GMKF-based 
algorithm. The pdfs of excitatory and inhibitory synaptic inputs 
are given by 

G 

p(N E (t)) = J2 a j^(VE,j{t), £ Bj j(0), N E(t) > 0 
;=i 

G 

p(Nj{t)) = ^a ; AT(^ ; (0, Ej,;(0). Nr(0 > 0 (2) 

where \i E j(t) and [iij(t) are, respectively, the mean of the excita- 
tory and inhibitory inputs at time t that belong to the j mixand 
(j € { 1 : G}). Here, G is the number of mixands. Similarly, Eg At) 
and Sr,;(0 are the time-varying variances of these inputs at 
time t, and a } is the weight corresponding to selecting the 
mixand. Our goal is to estimate Ne(0 and N[(t) in (1) by using 
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the GMM in (2). To this end, we are using extended Kalman 
filtering (EKF) to estimate the dynamics of (1) followed by the 
well-known EM algorithm to infer the statistical parameters of 
the synaptic inputs, |X£ ; (f), [hjAt), £e ,j(t), and Ejj(f) in (2). 
By using these statistics as the a priori knowledge, we repeat 
our algorithm until no considerable changes in the estimated 
dynamics occur. 

PROPOSED ALGORITHM, GAUSSIAN MIXTURE KALMAN FILTERING 

In this subsection, we present GMKF for identifying the exci- 
tatory and inhibitory synaptic conductances of a single neuron 
expressed by ( 1 ) from noisy membrane potential. In the follow- 
ing subsections, we use a notation x(0:t) = {x(0), x(l), . . ., x(t)} 
to represent the time trace of variable x from 0 to t. A special case 
of the GMKF, i.e., KF, will be described later on. Before describing 
GMKF and KF in detail, we introduce a general recursive frame- 
work for estimating (tracking) the parameters (dynamics) of a 
system whose hidden dynamics are represented by a state space 
model. 

General framework 

Figure 1 shows a block diagram of the general recursive frame- 
work for tracking the hidden dynamics and estimating the 



0) k=0, 6 =6 0 



l)Forward filteringto approach: 
E{x(t)\y(0:t)},E{x(t)xmy(O:t)} 



2) Backward filtering (Smoothing)to approach 
E {x(i)\y{0:T)} , E {x(i)x(t) H \y(0:T)} 



3) Inferring parameters, e.g., usingEM algorithm 

max g(8, 9) = max(£{log(/?(y, X | 8))| Y,d}) 

= max([log(p(r,X | 9))p(X I Y)dx) 



if no 



4) set 0 = 0 and k=k+l, 6 
Check stopping criterion 



if yes 



5) Results: 
x(0:T)&8 k 



FIGURE 1 | A block diagram of the general recursive framework. 

Schematic representation of the general recursive algorithm is shown for 
tracking the dynamical states and estimating the time-varying stochastic 
inputs represented by system (3), where x is the state of the system and y 
is the observation. Here, k and Bo are the iteration number and the initial 
values of the statistical parameters, respectively. X and Y are abbreviations 
for the entire samples of x and y over time, i.e., X = x(0:T) and Y = y(0: T). 
8 is the unknown statistical parameters of the system noise and the super 
script H represents the matrix transpose operation. 



(statistical) parameters of a dynamical system, S, 
defined as: 



S: 



x(t + 1) = F[x(f)] + v(r) 
y(t+l)=H[x(t+V)] + e(t) 



which is 



(3) 



where x(t) andyffj indicate, respectively, the state vector and the 
observation at time t, F and H are the transition and observa- 
tion functions, and v(f) and e(t) are the system noise (or the 
unknown stochastic inputs) and the observation noise, respec- 
tively. In Figure 1, 9 stands for the statistical parameters of v and 
e, e.g., the mean and variances. The objective of the recursive 
algorithm shown in Figure 1 is to estimate/track the dynamics 
of S as well as infer the statistical parameters of the stochas- 
tic sources v and e. Although this framework has been used in 
Huys and Paninski (2009) and Paninski et al. (2012), we show 
the effectiveness and usefulness of this framework for estimat- 
ing both the hidden states of a system (in a state-space model 
as well as those modeled as convolution relationship) and the 
statistics of its input. The recursive algorithm begins with an 
arbitrary initiation followed by filtering/smoothing steps (2 and 
3). These filtering/smoothing steps are necessary to identify the 
hidden dynamics S. Accomplishing this step and calculating the 
statistics (mean, variance, etc.) of such dynamics, the parameters 
of the stochastic sources can be inferred by using an appropri- 
ate optimization technique, e.g., the EM algorithm. Since these 
parameters construct the initial values of the next iteration, the 
algorithm can stop with an appropriate criterion. 

GMKF-based algorithm 

In this subsection, according to aforementioned recursive frame- 
work, we derive our algorithm for estimating the excitatory and 
inhibitory synaptic conductances using GMKF. Let x(t) = [V(t), 
gE(t), gi(t)] H denotes the vector of neuronal state at time t. We 
can represent the neuronal model ( 1 ) by the dynamical system (3) 
where the observation function is given by H[x(t)] = Cx(t) with 
a vector C = [ 1, 0, 0] , meaning that only the membrane potential 
is observed. Similarly, the transition function F is given by: 



F[x(t)] = 



L+gE(t)+gi(t)), dtE E , dtEi 
0, 1-^,0 
0, 0, 1 - £ 



V(t)~ 




' dtgiVL ' 


gB<f) 


+ 


0 


gl(t) . 




0 



(4) 



Here, e is a white Gaussian (observation) noise of variance o^, 
and the distribution of the system noise (dynamical noise) v(f) = 
[w(f), N E (t), Nj(t)] H [see (1)] is a GMM containing G mixands. 



P<?t) = ^a/A^-Cf), £ vJ (t)), 
;'=i 



[Lj(t) = [0, \L E j(f), |Al,;(f)] & £ V ,j(f) 



o£, o, o 
0, E B ,„(t), 0 
. 0, 0, Ei,„(t) 



(5) 
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where, otj is again the probability of selecting the j mixand, 
and Ne and N[, which are of our interest, describe excitatory 
and inhibitory synaptic inputs, respectively. Since the distribu- 
tion of the system noise is a mixture of Gaussians, one may 
simply use KF for each mixand. The major drawback of this 
approach is that the number of Kalman filters required to estimate 
the conditional probability p{x(t)\ y(0:t)) increases exponentially 
with time (Kotecha and Djuric, 2003); therefore, computational 
costs of this approach becomes very heavy. However, to eliminate 
this drawback, we use a parallel dynamic state space and resam- 
pling approach (Jayesh et al., 2003). The aim of this approach 
is to keep a constant number of Kalman filters for estimat- 
ing the conditional probability p(x(t)\ y(0:t)) upon arrival of a 
new observation at t. In this regard, the conditional probability 
p(x(t)\y(0:t)) is approximated by K filters. Then, it is obvious 
that K x G Kalman filters are required to represent p(x{t + 1)| 
y(0:t +1)) (see Appendix 1 for more details). Using a resampling 
technique, K filters are again selected to approximate the later 
probability; hence, the number of filters remains constant at the 
arrival of each new observation. Consistent with this description, 
p(x(t) \ y{0:t)) can be expressed as the combination of K parallel 
Kalman filters, as given below, 

K 

p(x(t)\y(0 :t)) = J2 m)P(x(t)\y(0 : t), i) (6) 
;= i 

where p(x(t)\y(Q:t), i) indicates the conditional pdf of the t 
filter and p\(f) is the normalized weight corresponding to the 
z' th Kalman filter at the arrival of a new observation at f. At the 
arrival of a new observation at time instant t + 1, the conditional 
pdf p(x(t + l)|y(0:r + 1)) is given by the K x G parallel Kalman 
filter [since p(x(t + l)|x(f)) is represented by G mixands] . 

K G 

pixit + l)\y(0 : t + 1)) w + W*^ + V\y(° ■ t + l )< t>D 

i=l;=l 

(7) 

where y;j(f+l) is the conditional probability of selecting 
the z'th filter and jth mixand at the arrival of y(t + 1), i.e., 
Yi,;(f + 1) = pihj I y(0:f + 1)). As mentioned above, to avoid 
increasing the number of filters at each new time, we resample 
to select the most K probable filters from the K x G filters used in 
(7). Consequently, (7) can be re-written as 

K 

p(x(t + l)|y(0 :t+l))^J2 W + + W° : t + 0 

;= i 

(8) 

where (Jj(t+ 1) is obtained by selecting the K most significant 
values of y; i(t + 1). In the next section, the KF is derived for each 
i € { 1 : K} and j € { 1 : G} . The final estimation of the states is the 
combination of the results of these filters. 

Kalman forward filtering. In KF, we use a set of mathematical 
equations underlying the process model to estimate the current 
state of a system and then correct it using any available sen- 
sor measurement (Haykin, 2001). In EKF, the first-order Taylor 



linearization of the non-linear process and measurement model is 
used to derive the underlying prediction-correction mechanism. 
Using (1), a priori (predicted) state estimate and error covari- 
ance matrix can be calculated at each f. Moreover, following the 
standard KF for linear time invariant systems, the correction step 
calculates a posteriori state estimate and error covariance matrix 
for this time instant. These variables will be used in the KF recur- 
sive framework for the next time instant t + 1, regarding the 
arrival of a new observation. According to the above-mentioned 
discussions, after combining results from K Kalman filters and G 
mixands at f, we run K x G parallel Kalman filters. Then, resam- 
pling to select K filter is accomplished before the arrival of new 
observation at f + 1. For each i belonging to {1:K} and; belong- 
ing to {1:G}, we aim to calculate the state estimate E{x{ ,;(t)| 
y(0:f)} and state correlation matrix E{x, \j(t)xj ,j{t) H \ y(0:r)} in 
the forward filtering step (see Figure 1) and E{xij(t)\ y(0:T)} 
and E{xii(f)Xi;(f) \ y(0:T)} in the backward filtering (smooth- 
ing) step using KF approach where £{.} stands for the expected 
value and *j ; is the state vector belong to the z th filter and 
j th Gaussian mixand. For the forward filtering step, for each i and 
we can apply the EKF approach (Rosti and Gales, 2001) as 
explained in Appendix 2. After computing E{x h j{t)\ y(Q:t)} and 
E{xi ,j(t)xij(t) | y{0:t)} in the forward filtering step, resampling 
is accomplished to select the most probable K filters. To do so, 
Y;,;(f) has to be calculated for each i, j at t. Then, y;j(t) can 
be easily sorted in descending order and the highest K values 
are selected [corresponding to the most probable state estimate 
E{xi j(t)\ y(0:t)} and state correlation matrix E{x, j(t)x{ j{t) H \ 

y(o-.t)}]. 

Kalman backward filtering (smoothing). In this step, we obtain 
the smoothed state estimate E{xi y(0:T)} and state corre- 
lation matrix E{x^j{t)x^j{t) H \ y(0:T)} and the corresponding 
weights y h j(t) for all i e {1 : K} , j 6 {1 : G} , re{l:T). This 
step is explained in detail in Appendix 3. Calculating E{xij(t)\ 
y{0:T)} and E{x u j{t)x h j{t) H \ y(0:T)} in the backward filtering 
(smoothing) step, we can infer the statistical parameters of the 
system noise v via the EM algorithm. 

Inferring statistical parameters via expectation maximization. 

The EM algorithm is a robust optimization technique for 
inferring the parameters of models involving unobserved data 
(Dempster et al, 1977), e.g., the excitatory and inhibitory synap- 
tic inputs Nj(f) and Nj(f) in this paper. This algorithm is guaran- 
teed to increase the likelihood of the model at each iteration and 
therefore can find a local optimum of the likelihood (Paninski 
et al, 2012). In this section, the EM algorithm is used to infer the 
statistical parameters of (3-5), i.e., the time varying mean ([idt)) 
and the variance of the states (cr^, E v j(t)), and the variance of 
the observation noise (cr~). Having sufficient statistics of the state 
estimate (mean and correlation matrices) of each mixand j and 
filter i (obtained in backward filtering step), we can easily calcu- 
late the final state estimate E{x(t) \ y(0:T)} as the combination of 
the mixtures and parallel filters. 

E{x(t)\y(0 : T)} = x(t) = J2J2 Y^W^M ( 9 ) 
' j 
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where x(t) = I V(t), gE(t), gi(t)j is the state vector estimated by 
KF. Note that for the sake of simplicity of expressing notations, we 
denote E{x LJ (t) \ y(0:T)} by x hJ (t). To use the EM algorithm, it is 
essential to write the joint distribution of the states and observa- 
tion, over time, as follows (X and Y denote the entire samples of 
x andy over time, respectively): 

p(Y,X, i,;|9) = p(Y\X, 6, i,j)p(X\Q, i, j)p(i\Q)p(j\Q) 
logp(7,X, i, j|9) = logp(i|9) + logpO'16) + \ogp(Y\X, Q, 

+ \ogp(X\Q,i,j) (10) 

We want to maximize the log of the joint probability of the states 
and observation via the EM algorithm for each mixture as follows. 

maxQ(6,e) = max (e |log (p(Y,X, i,j\Q))\Y, d\\ (11) 
s.t.e 

= max (^y log(p(Y,X,i 1 j\Q)jp(X\Y,&)dKj 

where 

K G 

p{X\Y) = £ £p(i,j|Y)p(X|U Y) (12) 
;=ij=i 

By doing the corresponding calculations to solve (11) (as 
described in Appendix 4), we can obtain the mean and variance 
of each mixand (for both excitatory and inhibitory inputs). By 
combining them, the total mean and variance of the synaptic 
inputs as well as the observation noise variance are calculated. 
As a result, we can update the statistical parameters of the exci- 
tatory and inhibitory synaptic inputs as well as the variance 
of the observation noise in the M-step (see Appendix 4 for 
full derivation). Inferring all parameters, we can initialize the 
next iteration of the recursive algorithm. The algorithm contin- 
ues until no considerable changes in two consecutive iterations 
occur. 

KF-BASED ALGORITHM 

The simplest case of our GMKF-based algorithm uses a sim- 
ple Kalman filter (G = 1 and K = 1) for the filtering/smoothing 
step. By providing the sufficient statistics in these steps, the 
non-parametric EM algorithm gives the smoothed mean and 
variance (both are time-varying) of excitatory and inhibitory 
synaptic inputs. As a brief description of this algorithm, the pdf 
p(x(t)\ y(0:t)) is approximated by only one Gaussian distribu- 
tion. Therefore, E{x(t)\ y(0:T)} [or x(f), which is given as a 
combination of K x G parallel filters in the GMKF] can be cal- 
culated through the standard KF. This strategy not only reduces 
the complexity of the GMKF-based algorithm but also results in 
highly accurate reconstruction of the excitatory and inhibitory 
synaptic conductances in many cases where the true (unknown) 
distributions of synaptic inputs are nicely approximated by a 
Gaussian distribution. Otherwise (when the true distributions 
are far from Gaussian), the estimated parameters from the EM 
algorithm (which is derived for the Gaussian distribution) give a 



smoothed version of the underlying true means and variances. 
Two issues have arisen from the specific choice of G = 1 and 
K = 1 that we need to clarify. First, the synaptic conductances 
have to be constrained as positive values. Second, the EM algo- 
rithm has to be derived based on truncated Gaussian distributions 
for the synaptic inputs. Note that these would not be an issue if 
G > 1 is used because the probability of having negative synaptic 
conductances naturally decreases with the number of Gaussian 
mixands. It turns out that neither of these issues affects the esti- 
mation of the synaptic conductances in the parameter regime 
studied here. 

The first issue could be easily addressed by using the con- 
strained KF (Gupta and Hauser, 2007), for example, implemented 
in the convex optimization toolbox CVX (Grant and Boyd, 
2012) to penalize the Kalman gain as follows (SDPT3 is another 
MATLAB package for semi-definite problem optimization that 
can be used): 

K c (t) = argmin {trace [(I - K(t)C)'£ t x - 1 (f)(I - K(t)C) H 

+K(t)a 2 e K(t) H ]} (13) 
s.t.D0t~\t)+K<t)e(fi) >0 3x i (14) 

where K c (t) is the constrained Kalman gain at t, x'~ l (t) and 
E£ _1 (t) are the predicted state estimate and state correlation 
matrix at t, respectively, and D is diagonal matrix with the values 
[—1, 1, 1] preserving the negativity of membrane potential (which 
is not necessary and important for results) as well as the posi- 
tivity of the synaptic conductances. According to this constraint 
optimization, the Kalman gain, at each time f, is calculated such 
that the positivity of synaptic conductances is satisfied. It is note- 
worthy that our results show that a simple constraining on the 
(updated) state estimate Dx { (t) ^ 0 (x'(t) = x'~ l (t) +K(t)e(t) 
in standard KF), without applying the constrained optimization 
for calculating the new Kalman gain, K c (t), has shown very sim- 
ilar performance to that obtained by using (13). This means that 
the simple and conventional KF with ignoring negative synaptic 
conductances [zero forcing the updated x'(t) for negative synap- 
tic conductances], which we use here, is a reasonable alternative 
to (13). 

The second issue makes the M-step of our EM algorithm more 
complicated than we have presented for the GMKF-based algo- 
rithm. Here again, we have heuristically found that the standard 
EM algorithm assuming Gaussian distributions of the synaptic 
conductances works very well because the estimated synaptic con- 
ductances rarely take negative values even if the largest noise level 
explored in this paper is applied. 

RESULTS 

NUMERICAL SIMULATIONS 

We have considered two conditions to analyze the performance 
of the proposed KF- and GMKF-based algorithms, i.e., the con- 
ditions with large and small signal-to-noise ratios (SNRs). First, 
we conducted two numerical simulations to demonstrate the per- 
formance of the KF-based (see Example 1) and GMKF-based (see 
Example 2) algorithms with large SNR, similar to Paninski et al. 
(2012), where the variances of system noise (cr^,) and observation 
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noise (crjr) were sufficiently small. Note that estimating excitatory 
and inhibitory synaptic inputs in this condition was relatively easy 
and the results did not depend much on the algorithms used. 
Then, the robustness of the KF- and GMKF-based algorithms 
were verified in three subsequent examples (see Examples 3-5) 
with small SNR, in which the PF-based algorithm (Paninski et al., 
2012) did not perform well. The time step for our simulations 
was 2 ms. In all the simulations our recursive algorithm (for both 
KF- and GMKF-based) ran for 10 iterations, which was consis- 
tent with previously used parameters (Paninski et al., 2012) and 
gave a fair condition to compare our proposed algorithms and 
the PF-based algorithm. Other model parameters used were sim- 
ilar to Paninski et al. (2012) and summarized in Table l.We first 
graphically show the estimation results and later summarize the 
quantifications in Tables 2, 3. 

Example 1. In this example, the mean pre-synaptic excitatory 
and inhibitory inputs were a non-linear function of their synaptic 
fields. 

E(N E (t)) = expfe(O) 
E(Nj(t)) = expfe(r)) 

where %e and ^/ were sinusoidally modulated (5 Hz) input sig- 
nals (amplitude modulation is 1). %j had 10 ms delay relative 
to The synaptic inputs, both excitatory and inhibitory, were 



Table 1 | Characteristics of the neuron model. 



E E 10 mV 

E, -75 mV 

E L -60 mV 
9l 80s 
Tf 3 ms 

x/ 10ms 



generated from a Poisson distribution. The variance of (volt- 
age) system noise (cr^) was negligible and that of observation 
noise (cr|) was 0.5 mV. Obviously, since we used a non-parametric 
EM algorithm, %e and %i were unknown. Figure 2 indicates the 
results of the KF-based algorithm in estimating the excitatory and 
inhibitory synaptic conductances. 

Example 2. In this example, the pre-synaptic mean functions 
were modeled by the absolute value of random realizations of 
Ornstein-Uhlenbeck processes (a white Gaussian noise is filtered 




0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 
Time (sec) 



FIGURE 2 | Estimating synaptic conductances and inputs given a 
single voltage trace of Example 1 using the KF-based algorithm. Each 
panel shows membrane potential (top), excitatory and inhibitory synaptic 
conductance (second and third from top), and excitatory and inhibitory 
synaptic inputs (fourth and fifth from top), respectively. Black solid lines 
represent true values and the red dashed lines represent the estimated 
ones. The blue dots (in top panel) represent noisy observations of 
membrane potential. The dashed green line describes the membrane 
potential reconstructed from the estimated synaptic conductances. The 
initial values of the KF-based algorithm were set as follows: The 
time-varying means (for both excitatory and inhibitory) were generated 
from a uniform distribution and their variances (for both excitatory and 
inhibitory) were 1 (for all times). 



Table 2 | Statistical analysis of the performances of the GMKF-, KF-, and PF-based (Paninski et al., 2012) algorithms in the example with 
structural synaptic input (specifications of the simulation were the same as in Example 4). 

AlgorithmsVeatures v g E g t 

PF 0.01 24 ± 1 x 10" 3 0.5658 ±5 x 10" 3 0.3046 ± 3 x 10" 2 

KF 0.0031 ±0.2 x 10" 3 0.4106 ± 0.7 x 10" 3 0.2614 ± 0.5 x 10" 2 

GMKF 0.0033 ± 0.2 x 10" 3 0.4611 ± 0.7 x 10" 3 0.2876 ± 0.5 x 10" 2 

The values describe trial means and standard deviations of the normalized estimation error of (15). 



Table 3 | Statistical analysis of the performances of the GMKF-, KF- and PF-based (Paninski et al., 2012) algorithms in the example with 
non-structural synaptic input (Specifications of the simulation were the same as in Example 5). 



AlgorithmsX features 

PF 
KF 

GMKF 



0.0246 ±7 x 10" 3 
0.0233 ± 7 x 10" 3 
0.0147 ±1 x 10" 3 



9e 

0.6678 ±0.4 x 10" 2 
0.6392 ±0.6 x 10" 2 
0.4599 ±0.6 x 10" 2 



9i 

0.6479 ±0.4 x 10" 2 
0.6322 ±0.7 x 10" 2 
0.5811 ±0.6 x 10" 2 



Definition of parameters is the same as Table 2 



Frontiers in Computational Neuroscience 



www.frontiersin.org 



September 2013 | Volume 7 | Article 109 | 6 



Lankarany et al. 



Inferring synaptic inputs using GMKF 



by an exponential filter of amplitude 0.4 and time constant 
1.11ms). The synaptic inputs, both excitatory and inhibitory, 
were generated from the Poisson distribution and the observa- 
tion noise was negligible. For the GMKF-based algorithm, we set 
G = 2 (number of mixands) and K = 4 (number of Kalman fil- 
ters; see our discussion about GMKF setting). The variance of the 
system noise (o^) was negligible and that of the observation noise 
(of ) was 0.5 mV. Figure 3 shows the results of the GMKF-based 
algorithm for this example. 

Example 3. In this example, the mean pre-synaptic input of 
excitatory was a cosine function (amplitude 1 and frequency 
of 5 Hz) and that for the inhibitory was a constant value 
(time-independent). Then, the synaptic inputs were generated 
from a Gaussian distribution of variance 1.5 and 0.05, respec- 
tively, for excitatory and inhibitory inputs. The small variance 



of the inhibitory synaptic input generated a very narrow dis- 
tribution function (almost delta function). The variances of 
the membrane voltage (0^) and observation noise (of) were 
10~ 2 and 5 mV, respectively. These parameters were chosen not 
because they are physiologically realistic but they illustrate dif- 
ferences in the algorithms. Figure 4 shows the results of the 
KF- and PF-based algorithms in estimating the synaptic conduc- 
tances from the observed noisy membrane potential generated 
in this example. 

As can be seen from Figures 2, 3, both the KF- and 
GMKF-based algorithms accurately identified the excitatory and 
inhibitory synaptic inputs. These results are not very surprising 
given the large SNR used in these examples. In fact, the PF- 
based algorithm could also accurately estimate synaptic inputs 
under similar conditions (Paninski et al., 2012). In the following 
examples, we explored cases with a small SNR. 

Figure 4 shows that gE and g\ as well as the membrane voltage 
were better estimated using the KF-based algorithm. It is clear that 
the PF-based algorithm could not track either gE or g[. Figure 5 
shows the distributions of excitatory and inhibitory synaptic 
conductances. It shows that the KF-based algorithm could esti- 
mate the true distributions of gE and gi very well, while the 
PF-based algorithm failed especially for the inhibitory synaptic 
conductance. 

In Example 3, we considered an extreme case in which 
the inhibitory synaptic input had very narrow distribution. 
In this case, the KF-based algorithm (by selecting appro- 
priate initiation, i.e., large enough variance) could effec- 
tively estimate both excitatory and inhibitory synaptic con- 
ductances though the PF-based algorithm completely failed 
(see Figures 4, 5). Under this small SNR condition, the prior 
distribution of synaptic input made an important contribu- 
tion to the results. While the exponential prior distributions 
assumed for the PF-based algorithm tended to underestimate 
the inhibitory synaptic input, the KF-based algorithm could bet- 
ter approximate the inhibitory input by fitting a single Gaussian 
distribution. 



0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 



FIGURE 3 | Estimating excitatory and inhibitory synaptic conductances 
given a single membrane potential trace of Example 2 using the 
GMKF-based algorithm. Other descriptions concerning this figure are the 
same as those in Figure 2. The initial values of the GMKF-based algorithm 
(G = 2, K = 4) were set as follows: the time-varying means (for both 
excitatory and inhibitory) were generated from a uniform distribution and 
their variances (for both excitatory and inhibitory) were 1 for both mixands 
(for all times). 



0' i i i i i i i i i ns r^'-v 1 V i 'v" 1 %^ i \ 

0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 

Time (sec) Time (sec) 



FIGURE 4 1 Estimating synaptic conductances and inputs given a 
single voltage trace of Example 3 using the KF-based (left) 
and PF-based (right) algorithms. Other descriptions about the 
figure are the same as those in the Figure 2. The initial values 
of the KF-based algorithm were as follows: the time-varying 



means (for both excitatory and inhibitory) were generated from a 
uniform distribution and their variances (for both excitatory and 
inhibitory) were 5 (for all times). This initial setting (increasing the 
variance) helped the KF-based algorithm to better estimate the 
distributions of the synaptic inputs in this example. 
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Example 4. In this example, the specifications of the synaptic 
inputs were the same as those in Example 1 (amplitude modu- 
lation is 1.5 for both excitatory and inhibitory inputs). However, 
the variances of the membrane voltage {a 2 w ) and observation noise 
(a 2 ) increased to 1CP 2 and 5mV, respectively. Figure 6 shows 
the results of the GMKF- and PF-based algorithms in estimating 
the synaptic conductances from the observed noisy membrane 
potential generated in this example. 

The results of each algorithm in Figure 6 confirm that the 
gE and gi (and therefore Ne and N[) were better estimated 
using the GMKF-based algorithm than the PF-based algorithm. 
It should be noted that the membrane potential was also bet- 
ter tracked using the GMKF-based algorithm. To see how these 
algorithms approximate the distributions of the excitatory and 
inhibitory synaptic conductances, we plotted histograms of gE 
and gi estimated by the GMKF- and PF-based algorithms in 
Figure 7. As can be seen from Figure 7, the approximated his- 
togram of the GMKF-based algorithm better represents the true 
distributions for both excitatory and inhibitory synaptic conduc- 
tances. 

Example 5. In this example, the pre-synaptic mean functions 
were modeled by the absolute value of random realizations of 



FIGURE 5 | Histo 
the PF-based (bl 



FIGURE 6 | Estimating synaptic conductances and inputs given a single 
voltage trace of Example 4 using the GMKF-based (left) and PF-based 
(right) algorithms. Other descriptions about the figure are the same as 
those in Figure 2. The initial values of the GMKF-based algorithm (6 = 2, 



Ornstein-Uhlenbeck processes (same as Example 2). The synap- 
tic inputs, both excitatory and inhibitory, were generated from 
the log-normal distribution of variance 1.2. The variances of the 
membrane voltage (0^) and observation noise (a 2 ) were 10~ 2 and 
5 mV, respectively. Figure 8 shows the results of the GMKF-based 
and PF-based algorithms in estimating the synaptic conductances 
from the observed noisy membrane potential generated in this 
example. 

Similar to Example 4 where the GMKF-based algorithm out- 
performed the PF-based algorithm, Figure 8 indicates that the gE 
and gi as well as the membrane voltage were better estimated 
by the GMKF-based algorithm. The estimated gE and gi using 
the PF-based algorithm could not follow the rapid fluctuations 
of the synaptic conductances. Figure 9 depicts the histogram of 
the true and estimated gE and gj using the GMKF- and PF-based 
algorithms. 

A heavy high-amplitude tail of the distribution of synaptic 
inputs has often been observed in neuronal circuits (Song et al., 
2005; Lefort et al, 2009; Ikegaya et al., 2013). The heavy tail of 
the log-normal distribution in Example 5 (for both gE and gi) 
occasionally produced large synaptic inputs and induced rapid 
changes in synaptic conductances, which the PF-based algorithm 



(red) and 



K = 4) were as follows: the time-varying means (for both excitatory and 
inhibitory) were generated from a uniform distribution and their variances (for 
both excitatory and inhibitory) were 0.5 for the first mixand and 2 for the 
second mixand (for all times). 



500 
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>gram of the excitatory (left) and inhibitory (right) synaptic conductances of the true (blue), estimated using the KF-based 
ack) algorithms in Example 3. 



> > 
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FIGURE 7 | Histogram of the excitatory (left) and inhibitory (right) synaptic conductance of the true (blue), estimated using the GMKF-based (red) 
and PF-based (black) algorithms in Example 4. 
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FIGURE 8 | Estimating synaptic conductances and inputs given a single as follows: the time-varying means (for both excitatory and inhibitory) were 

voltage trace of Example 5 using the GMKF-based (left) and PF-based generated from a uniform distribution and their variances (for both excitatory 

(right) algorithms. Other descriptions about the figure were the same as the and inhibitory) were 1 for the first mixand and 4 for the second mixand (for all 

Figure 2. The initial values of the GMKF-based algorithm (G = 2, K = 4) were times). 




FIGURE 9 | Histogram of the excitatory (left) and inhibitory (right) synaptic conductances of the true (blue), estimated by GMKF-based (red) and 
PF-based (black) algorithms in Example 5. 



could not keep track of. Hence, this result likely applies to the per- 
formance of the GMKF-based vs. PF-based algorithms for heavy- 
tailed distributions in general. As it is clear from Figures 8, 9, 
the GMKF-based algorithm could better track synaptic inputs 
because GMKF (in this example) used two Gaussian mixands that 
provide more degrees of freedom for fitting the log-normal distri- 
bution than only one exponential distribution, which was used in 
the PF-based algorithm (Paninski et al., 2012). 

Theoretically speaking, the PF-based algorithm (Paninski 
et al., 2012) does not perform accurately under small SNR con- 
ditions if the true underlying distributions for synaptic inputs are 



different from the presumed prior distributions [e.g., an expo- 
nential distribution (Paninski et al., 2012)]. Our examples with 
various distributions of synaptic inputs confirmed that the PF- 
based algorithm (Paninski et al., 2012) works well if the variance 
of the observation noise and membrane voltage noise are suf- 
ficiently small. The PF-based algorithm can give approximately 
the same results as the GMKF-based algorithm in this case. 
However, our examples suggest that the PF-based algorithm does 
not accurately estimate synaptic inputs from distributions that are 
not properly approximated by the prior distribution (Examples 
3-5) in noisy systems. Under this condition, the GMKF-based 
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algorithm outperforms the PF-based algorithm due to its capa- 
bility of estimating an arbitrary distribution of synaptic inputs by 
using a GMM. It should be noted that a larger number of mixands 
(G > 2) maybe necessary if the synaptic input distribution is dis- 
similar to a Gaussian distribution: for example, with a very long 
tail. 

Furthermore, as can be seen from Figures 2-4, 6, 8, the recon- 
structed membrane potential (dashed green lines) from the esti- 
mated synaptic conductances are closely overlap the estimated 
membrane potential (red dashed line in the top panels) obtained 
by GMKF and KF (or PF). This suggests that the difference of the 
estimated membrane potential and the reconstructed membrane 
potential based on the synaptic conductances was negligible in the 
range of the noise levels we have examined. 

STATISTICAL ANALYSIS 

In addition to the above-mentioned observations from the simu- 
lation results and in order to compare our algorithms with the 
PF-based algorithm (Paninski et al., 2012), a statistical analy- 
sis was performed in this section. Two types of synaptic inputs, 
namely, structural (cosine function) and non-structural (O-U 
process) were considered to generate the membrane potential. 
Then, each algorithm was applied to 10 trials of these membrane 
potentials. For the numerical simulations with the structural 
synaptic input, the same specifications as in Example 4 were used 
and for the example in which synaptic inputs were generated from 
the O-U process, the same specifications as in Example 5 were 
applied. Tables 2, 3 quantify the performance of each algorithm 
in these examples. For each algorithm, the mean and standard 
deviation (std) of the normalized error over time was calculated 
for V, gE, and gi where the normalized error is defined as: 



J £ [x„(t) - x n (t)] 2 
err(n) = 1 — (15) 

JX>n(0 2 

where, x„and x n are the true and estimated values of the n th trial, 
respectively. The mean and std were calculated over 10 trials, 
err{n)\ ^mo- 
According to these tables, we can conclude that the perfor- 
mance of our KF- and GMKF-based algorithms was better (for 
all parameters) than that of the PF-based algorithm. When the 
synaptic distribution was not heavy- tailed (Table 2), the KF- and 
GMKF-based algorithms had approximately the same perfor- 
mance. However, for a heavy-tailed synaptic distribution (log- 
normal in Table 3), the GMKF-based algorithm outperformed 
the KF-based algorithm. In the GMKF-based algorithm, one 
could use G > 2 (number of mixands) which results in more 
expensive computations. In our simulations, however, G = 2 was 
sufficiently good to provide the balance between computational 
costs and accuracy. For very heavy-tailed distributions the higher 
the value of G was the better accuracy was obtained for estimating 
synaptic inputs. Note that the simulations of (G = 2 and K = 4, 
i.e., eight filters for each time) only took approximately the same 
running time as the PF-based algorithm. Moreover, we observed 



that K = 2, 3, or 4 [number of filters used for estimating p(x{t) \ 
y(0:t))] did not change the final results considerably. As a rule 
of thumb, we concluded that, K = G is a good choice for select- 
ing the value of K. It should be noted that when G = 1 and K > 
1 (the system noise is approximated by only one Gaussian dis- 
tribution) it is called Gaussian sum filtering [Kalman or particle 
can be applied, see Kotecha and Djuric, (2003)]. In this case, 
the conditional probability p(x(t)\ y(0:f)) is estimated using K 
Gaussian filters. However, our case with G > 1 and K > 1 is 
called Gaussian mixture filtering. In fact, in this case, G > 1 forces 
K to be greater than one in order to better approximate p(x(t)\ 
y(0:t)), whose filter number grows exponentially overtime [see 
(Kotecha and Djuric, 2003)]. Note that the number of filters G 
and K can be chosen based on a standard model selection crite- 
rion such as Akaike's or Bayesian information criterion (Akaike, 
1974; Burnham and Anderson, 2002) with real experimental data. 

As the main conclusion of our results, we found that our pro- 
posed KF- and GMKF-based algorithms outperform the PF-based 
algorithm. Generically, the GMKF-based algorithm offers a more 
powerful estimation method than the KF-based algorithm by pro- 
viding higher degrees of freedom to fit synaptic inputs. However, 
it is noteworthy that even the KF-based algorithm gives simi- 
lar results to the GMKF-based algorithm unless the underlying 
synaptic input distributions are complex. In many cases, the KF- 
based algorithm is not only much simpler than both the GMKF- 
and PF-based algorithms but also much faster: therefore, more 
efficient. 

DISCUSSION 

We have proposed a recursive algorithm based on GMKF for esti- 
mating the excitatory and inhibitory synaptic conductances (and 
inputs) from single trials of noisy recorded membrane poten- 
tial. Two other methods have been proposed in the literature 
along this direction. Quick alternation of membrane potential 
between excitatory and inhibitory reversal potentials (Cafaro and 
Rieke, 2010) enabled nearly simultaneous reconstruction of exci- 
tatory and inhibitory synaptic conductances from single trials. 
One advantage of our method compared to this approach is that it 
does not require rapid alternations of membrane potential, which 
might cause experimental artifacts. Thus, our method provides 
wider applicability to existing as well as future experimental data. 
Another approach is to infer excitatory and inhibitory synaptic 
conductances by using the oversampling method (Bedard et al., 
2012). Unlike this approach, our KF/GMKF algorithms do not 
require the manual adjustment of oversampling time steps to 
suppress singularity problems. The main advantage of our meth- 
ods in comparison with Kobayashi et al. (2011) and Paninski 
et al. (2012) relies on the fact that it has the flexibility to esti- 
mate an arbitrary (and unknown) pdf of the synaptic inputs by 
using a GMM. Moreover, we derived and tested a special case 
of the GMKF-based algorithm when there is only one mixand, 
i.e., the Kalman filter, for estimating the excitatory and inhibitory 
synaptic conductances. The simulation results have demonstrated 
the accuracy and robustness of the proposed algorithms in noisy 
conditions for estimating synaptic inputs generated from differ- 
ent distributions. In this regard, we have found that the GMKF- 
and KF-based algorithms outperform the PF-based algorithm 
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(Paninski et al., 2012). We have also found that the GMKF- and 
KF-based algorithms have approximately identical performances 
in many cases where simple distributions of synaptic inputs are 
assumed. On the other hand, the GMKF-based algorithms pro- 
vide much more accurate estimation than the KF-based one when 
synaptic inputs are drawn from heavy-tailed distributions with 
many strong synapses. In practice, running both KF-based and 
GMKF-based algorithms and comparing their results should pro- 
vide an idea on how complex the underlying distributions of 
synaptic inputs are. Therefore, the simplicity and high speed of 
the KF-based algorithm as well as the robustness and general 



applicability of the GMKF-based algorithm make them efficient 
techniques for neuroscientists to monitor trial-to- trial variability 
of the excitatory and inhibitory synaptic inputs. 
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APPENDIX 1 

In order to show how the number of mixture Kalman filters grows exponentially with the arrival of a new observation y(t), we assume 
thatp(x(f — 1)| y(0:t — 1)) can be represented by K Kalman filters as follows. 

K 

p(x(t - l)|y(0 :t-l)) = J2 Pi(t - l)p(.x(t - 1)|;K0 : f - 1), i) (Al.l) 

!= 1 

where p,(f — 1) is the normalized weight corresponding to the i th Kalman filter at time t — 1. Then,p(x(f)| y(0:t- 1)) will be given by 
the following equation considering thatp(x(f)| x{t — 1)) is approximated by G mixands. 

G K 

p(x(t)\y(0 : t - 1)) = J2J2 a M f ~ l)P« f )tv(0 : t - 1), (A1.2) 

;=li=l 

Now p(x(t)\ y(0:t)) can be calculated using the Bayesian rule. 

pOWIrCo : f - i)) 

G K 

£ I] - Dp(y(OI*(0, i,j)p(x(t)\y(0 :t-l), 

_ i = i i=i 

p(y(f)|y(0:f-l)) 

G K (A1.3) 
J2J2 a M f - DPWOWO :t-l),i,j)p(x(t)\y(0 : f), 

= i=i'=i 

p(yW|y(0:t-l)) 

g ;c 
i=n=i 



where 



yy(t) =p(x',jl7(0: t)) 

aj(0P,-a-i)p(y(0ly(0:^ 

a Jf G (A1.4) 

i=ij=i 



and 



p(y(f)|y(0 : f - 1), i, ;) = J p(y(f)|x(r), i, ;>(x(t)|y(0 : » - 1), i, j)dx(t) 

P p(y(t)\x(t),i, j)p(x(t)\x(t - \),j)p(x(t- l)|y(0 : t - 1), i)dx{t)dx{t - 1) 



II' 



(A1.5) 



Although (A1.5) can be easily obtained using the results of KF (see Appendix 3), (A1.3) indicates that p(x(t) \y(0:t)) is approximated 
by K x G filters. Consequently, p(x(t + l)|y(0:f +1)) will be represented by K x G 2 filters and the number of filters increases expo- 
nentially with the arrival of each new observation. As mentioned in Section Results, this problem can be overcome by resampling. To 
do this, it is required to select the most probable K filters of (A1.3) from Yi,;(f)- m this regard, p(x(t)\y(0:t)) will be approximated by 
only K filters as follows. 

K 

p(x(t)\y(0 : t)) w ^PiWpWOIKO : 0, 0 (A1.6) 

i= l 

Where P;(f) is the weight corresponding to the K highest values of Yij(t). Since p\(f) is normalized after the resampling process, it 
gives equivalent weights to all selected filters (I IK for each filter). Please note that the resampling process is done only to eliminate 
the increase in the number of filters required for approximating p(x(t)\y(0:t)) at the arrival of new observation y(t). Hence, K x G 
Kalman filters are needed to run at each time t to compute p(x(t) |y(0:f)) for each new observation y(t). 
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APPENDIX 2. KALMAN FORWARD FILTERING (SEE Rosti and Gales, 2001) 

In this appendix, K x G Kalman forward filtering is derived for each i belonging to {l:K} and j belonging to {1:G}. The standard 
KF technique includes two main steps: 1-time update and 2-measurement update (Haykin, 2001). In the time update step, for each i 
and;', the predicted state estimate E{X{ j(t)\ y(0:f — 1)} and predicted state correlation matrix E{xij(t)xij(t) H \ y(0:t — 1)} (therefore 
the state covariance matrix, E{xij(t)xij(t) H \ y(0:t - 1)} — [E{x{ ;(f)l y(Q:t — l)}) 2 ] are calculated using (1) [or (3), i.e., the system 
dynamic]. Then, in the measurement update step, the modified state estimate E{xij(t)\ y(0:t)} and modified state correlation matrix 
E{xij(t)xjj(t) H \ y(0:t)} are updated using the recursive Kalman framework. In the following equations, for the sake of simplicity in 
representing the notations, E{xjj(t)\ y(0:t — 1)} and E{xjj(t)\ y(0:t)} are denoted as x\^{t) andxf At), respectively. Accordingly, the 

predicted and updated state covariance matrices are denoted as At) and ■ At), respectively. For each i and;', the time update 
step can be followed by 

x*r.\f) =A{f)x\-\t-l) + \i %j {t- 1) 

( A2J ) 

K7i,)w = A ^K7, ( f - v A (t) H + Svj 

where A(t) = \ x ( t ) is the time dependent transition matrix, and [i v j and E v ; are the mean and variance of the synaptic input 
corresponding to / h mixand. The so called Kalman gain is then obtained by 

K hj (t) = E^ i(t)C H E-) ; (f) (A2.2) 

where 

E W . = C£ f ~ \{t)C H + al (A2.3) 

Here, C is the observation vector [1, 0, 0] and a\ is the observation noise variance. By defining the innovation process as, e h j(t) = 
y(i) — Cx[y 1 (f), the updated state estimate and state covariance matrix are calculated as follows. 

x\At)=x , i - i l (t)+K h j(t)e u j(t) 

' , , (A2 - 4) 

s i i ,( ( ) = s:->)-^ J wc^>) 

Since the KF-based algorithm is a recursive algorithm, (A2.4) is used for the next iteration and x* -{t) (or E{xij{t)\ y(0:t)}) is estimated 
for the all time samples of the observation data y(0:T). 

APPENDIX 3. KALMAN BACKWARD FILTERING (SMOOTHING) (Rosti and Gales, 2001 ) 

Similar to Appendix 2, Kalman backward filtering is accomplished for K x G Kalman filters at each iteration, f. The resampling 
procedure is already performed to eliminate the increase in the number of filters estimating the p(x(t) \ y(0:t)). The goal with backward 
filtering is providing a better estimate of the state mean E{xjj(t)\ y(0:T)} and the state correlation matrix E{xjj(t)xi t j(t) H \ y(0:T)} 
using all observed datay(0:T). Again, for the sake of simplicity, E{xij(t)\ y(0:T)} and the state covariance matrix [E{xij(t)xi tJ (t) H \ 
y(0:T)} — (£{«,•_ ,-(f) | y(0:T)}) 2 ] are denoted as Xij(f) and Y, x j j(t), respectively. Following the standard Kalman backward filtering 
algorithm, for each i belonging to {1:-K"} andj belonging to {1:G}, we have: 

*;J 1 = <7 1 (f ~ D + h,(t - i) (k,(t) - <7 1 (f)) (A3.1) 

where 

Ji,j(t - 1) = ^-^(t - l)A{t) H (e^'(0) _1 (A3.2) 
The state covariance matrix is calculated as follows. 

t Xi hj (t - 1) = tf-^it - 1) + Ji,j(t - 1) (£ x , i:j (t) - ^ x jj(t)) (J i:j (t - l)f (A3.3) 



and E 



t - i )(t) = ± XtiJ (t)(j iij (t-l)) H . 
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For time indices, since in backward filtering the initial values start at t = T, we have 

Xi,j(T) = xfjiT), ±l UJ (T) = V T xlj {T),R h ,{T) = t T x i j {T) + x i , j {T)x i , j (T) H 



(A3.4) 



Moreover, it is necessary to obtain the correlation matrices E{X{ : j(f)Xi ,;(f) H | y(0:T)} and E{xjj(t)xij(t — 1) H \ y(0:T)} 
for the expectation maximization (EM) algorithm. These matrices are, respectively, denoted as (f) and Rij(t) in 

our equations. 



R xJtj (t) = ^ h j(t)+x h j(t)(x hJ (t)) H 

R x ~ 1 (0 = K7i)V) + kj(t)(k,(t - 

Having the sufficient statistics, we can use the EM algorithm to infer the statistical parameters of each mixture. 



(A3.5) 



APPENDIX 4. EXPECTATION MAXIMIZATION (EM) ALGORITHM FOR THE GMKF-BASED ALGORITHM 

Recalling (10) and (11), we want to maximize the log of the joint probability of the states and observation via the EM algo- 
rithm for each mixture. Then the results are combined to yield the final estimation of the states as well as the distributions of the 
synaptic inputs. 



maxQ s t g(6, 6) = max J J £>(x, log (p(Y,X, i, j|0)) p(X\i,j, Y)dX ^ 

^p(i,iiy)iog(p(7,x,i,j|e))|7,e 



max E-, 



pa/i,j, Y) 



(A4.1) 



where X and Y stand for all the states and observation over time, respectively. The expected value of the joint probability in (A4. 1 ) can 
be expanded as follows. 



b p(X/n,m,Y) 



J]p(j,j|7)log(p(7,Jf,i,;|9))|7,e 



(A4.2) 



= J2Y1 : Y) { logoff) + lo g p,(f) + logp(y(t)\x(t), 6) + \ogp(x(t)\x(t), 6)} 

i j t=i 

T T 

= Yl J2 I>(M'W° = t))E p(x/l ,j, Y ) { logoff)} + I^PftilKO : t))E p(x/l ,j, Y) { Iogfc(f)} 

i j t= i i j t=\ 

T 

+ E ED ft ^ 0 : \-\ logo? + (y(t) - x(f)) H (o^) _I (y(f) - *(*)) j 

i j t=i 1 1 

r 

+ I]EI>(i>il>'(0 : t))E p( x /itjtY) 

i j t=l 

(-i logE v , ilJ - + (x(t)-Ac(0-H'W(t " l)) H (E»,^) -1 (3c(0-Ae(f-l)-|ii,j(t - 1)) 

Considering (A4.2) to solve (A4.1), we need to calculate Ep(x/i,j, y) W*) }, Ep(x/i,y, y){*(f — l)*(f) H } and £p(x/i,j, y) {*(£) *(f) H }- 
These statistics are already calculated in the backward filtering step as indicated by R^it), and respectively. 
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Furthermore, p{i, j\ y(0:t)) is already defined in (A1.4) and can be simply computed as follows [calculating the double integral in 
(A1.5)]. 

p(y(t)\y(0 : t - 1), = j p(y(t)\x(t), i, j)p(x(t)\y(0 : t - 1), ij)dx(t) 

= f f p(y{t)\x(t),i,j)p(x(t)\x(t-l),j)p{x(t-\)\y{Q-.t-\),i)dx(t)dx(t-\) 



j p(y(t)\x(t), i, j)N(x(t); ^ /(f), t'^ 1 (t))dx(t) 
N-(/;Gjc[ y (0,^ + Ct<,j(t)C H ) 



(A4.3) 



And p(i, j\ y(0:t)) can be estimated as 



p(i,j\y(P : f)) = Y;,j(f) 

aj(t)Ut-l)p(y(tWO:t-l),i,j) 



(X 



K G 



(A4.4) 



££o,-(f)pi(f- l)p(y(t)\y(0 : f- 1), 
i=iy=i 



Note that f$, (f — 1) is the normalized weight after resampling and is equal to II K for all t. Now we can derive theM step by taking 
the derivative of (A4.2) due to the unknown parameters [a, [L v , E v , o\]. It should be noted that it is convenient to demonstrate the 
optimized parameters in the EM algorithm as [a, yi v , E v ,rjg]. For a,-, one can write 



maxQ(9, 9) = max ^ ^ YijW^x/i,;, Y) { logoff)} 



u.8 



(A4.5) 



and the fact that £^ a; = 1. Using Lagrange multiplier we can solve the following optimization. 



'j <= i \ j / 



(A4.6) 



which results in solving the following set of equation. 



K T 



;=if=i ; 
£>-l = 0 



K T 



-1 t=l 



(A4.7) 



The optimized |i, ; can be calculated by taking the derivative of (A4.2) due to this parameter, as given below: 

3fcl]Yi,;(0%X/<,j,y) {-i log Ey. i,j + (x(t)-Ax(t-l) - (L,,j(t- l)f (S v , y) -1 (*(t) - Axf t - 1) - («,/(*-!))} 



3Q(6, 9) 
9jx 



3|x 



= (E v , «,;■) 1 £ £ Yi,/(f) {hj(t) - Axij(t - 1) - (x,, ; (t - 1)) 



(A4.i 
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As we want to calculate jljj for each i, j, we can rewrite (A4.8) as 

3Q(9, 9) 



0 =► E Y,,,(0 (%;( f ) " Mt)hj(t ~ 1) - " *)) = 0 



T 



f=2 

Kj = 



f = 2 

Obviously, (A4.9) can be obtained for each mixand j as follows (by combining K filters). 



K T 



K EE v >'j (t) foj® ~ A ^ k ',j(t - 1)) 

H = E hi = jr~r (A4 - 10) 



!= 1 f = 2 



In the non-parametric approach we are using the goal is to estimate the time-varying mean (and variance) using a (defined) basis 
function X (spline basis function in this paper). As a result, the time-varying means E{N E ,j(t)} and E{Nij(t)} can be estimated as 
follows (recall that |xyC*) = [0, E{N E ,j(t)}, E{Nij(t)}]). 



E Y«.;( f ) (h, ijO) ~ a EgE, i, jit ~ !)) 



E{N Ed (t)} 



Ey»,;(0 
i = i 

K 



' (A4.ll) 



;= i 



where g£ andg; are the second and third component of the state vector x, respectively. To model E{NEj{t)} = X(t)u> E ; and£{N/j(t)} = 
X(t)w h j, we need to find the weighting vectors, u> E j and a>, j, of the basis function X as follows: 

coe, ; = (X T Xr 1 X T E{N Ej } 

(A4.12) 

Q) Zii = (X T X)- 1 X T E{N lj } 

where E{N E j} and A indicate E{N E j(t)} and , ;(f)} over the entire time. Similar to Paninski et al. (2012), the basis function 
X consists of 50 spline basis functions. 

Similarly, the covariance matrix £ v ,i,i can be inferred by taking the derivative of (A4.1) due to this parameter. 

3 -\ logS v , k j + (*(t) -A(f)x(t-1)- (Li, j(t-l)) H (%,,.,) 1 (x(t) - A(t)x(t - I) - f H] (t - I)) 

9Q(9, 9) _ \ i,j t=i [ 1 



i r 

-EE YiW ( S ". M - (*>'.;'W " A W»i,;(f - D - (H;0 " 0) (*.,/« - A(f)i,, ; (f - 1) - fl;, ; 0 - 1)) H ) (A4.13) 



U f=i 
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Then, for each i and j we have 
T 



3Q(9, 6) 1 



9E„ 



-l 



E Y., ; (f) (ty.j ~ (kj(t) ~ A(t)Xij(t - 1) - ji;,^ - 1)) (x hj (t) - A(t)k hj (t - 1) - $,ij(t - = 0 



t=i 

T 



^V, i,j 



^ ] Y/7, m (0 



(A4.14) 



Similar to (A4.9), (A4.14) can be obtained for each mixand j as follows, 

K T 



: _ l + — T \ ' 



i=l f=2 



if T 
i=l r=2 

The time varying variances of the excitatory and inhibitory inputs can be expressed as: 

K 



i= 1 



E*.i« 
i=i 



;= i 



R,,j(t-l) gi,i,j(t) 



i= 1 



(A4.15) 



(A4.16) 



where R,.j(t) | stands for the second row and second column of the matrix Ri,j{t). In our non-parametric model, E^ j(t)= X(t)\E,j 
andE^ y(f)= _X (f)X,j. The weighting vectors Vej and X; ; of basis function X can be given by 



x L] = (x T xy 1 x T ± lj 



(A4.17) 



where Ee ; and E/, j indicate Eg, j(t) and Ej ;(£) over the entire time, respectively. Hence, the system noise (including synaptic inputs) 
can be represented by a GMM as follows. 



vCO-E^O^WA;^)) 



(A4.18) 



where |ij(f) and E v ,(f) are defined by (5) and their components are calculated in this Appendix. Moreover, as mentioned before, we 
can simply calculate the final state estimate x(t) as the combination of JC x G parallel filters at each time f. 



x(t) = Ey«,jO)*U« 



(A4.19) 



Note that, the variances of the membrane voltage (cr 2 ^) and observation noise (cr 2 6 ) can be calculated in a straightforward way, as 
mentioned in Paninski et al. (2012). 
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